1/29/2015

The rOpenSci project

building tools, building community

Packages

Contributors

Community

Goals

  • Increase the availability and quality of R packages to interface with research data
  • Support reproducibility along the whole data analysis pipeline
  • Sustainable software design
  • Sustainable community

Remote sensors

credit: NASA

Micro sensors

credit: NSF

NEON

OOI

Computer simulations

credit: NERSC

Field-based study

credit: Scambos & Bauer, NSIDC

Growth of climate data by type

The real challenge is doing science today that is relevant to the data tomorrow

Example: Synthetic analysis of biodiversity loss

Synthesizes over 140 data sets.

Finds no evidence for systematic loss.

How easy would it be to update this to reflect new data?

What is "a lot" of data?

Engineering bottlenecks

From bottlenecks to workflows

Example workflows: Dynamic documents